{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Analyze Training Results\n",
    "\n",
    "## Pre-Requisite: _You must run the previous notebook to generate debugger artifacts._"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "%matplotlib inline\n",
    "%config InlineBackend.figure_format='retina'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%store -r training_job_debugger_artifacts_path"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "try:\n",
    "    training_job_debugger_artifacts_path\n",
    "except NameError:\n",
    "    print('++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++')\n",
    "    print('[ERROR] Please wait for the previous notebook to finish.')\n",
    "    print('++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(training_job_debugger_artifacts_path)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Analyze Tensors"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Before getting to analysis, here are some notes on concepts being used in Debugger that help with analysis.\n",
    "* **Trial** - object that is a center piece of Debugger API when it comes to getting access to tensors. It is a top level abstract that represents a single run of a training job. All tensors emitted by training job are associated with its trial.\n",
    "* **Step** - object that represents next level of abstraction. In Debugger - step is a representation of a single batch of a training job. Each trial has multiple steps. Each tensor is associated with multiple steps - having a particular value at each of the steps.\n",
    "* **Tensor** - object that represent actual tensor saved during training job. Note, a tensor can be a 1-D scaler, as well (ie. loss is stored as a scalar).\n",
    "\n",
    "For more details on these concepts as well as on Debugger API in general (including examples) please refer to Debugger Analysis API documentation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from smdebug.trials import create_trial\n",
    "\n",
    "# this is where we create a Trial object that allows access to saved tensors\n",
    "trial = create_trial(training_job_debugger_artifacts_path)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "for i in trial.tensor_names():\n",
    "    print(i)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "\n",
    "def get_data(trial, tensor_name, batch_index, steps_range, mode):\n",
    "    tensor = trial.tensor(tensor_name)\n",
    "    vals = []\n",
    "    for step_num in steps_range:\n",
    "        val = tensor.value(step_num=step_num, mode=mode)[batch_index]\n",
    "        vals.append(val)\n",
    "    return pd.DataFrame(columns=['steps', tensor_name], data=list(zip(steps_range, vals)))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Get Tensors"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import time\n",
    "\n",
    "steps = 0\n",
    "while steps == 0:\n",
    "    # trial.steps return all steps that have been downloaded by Debugger to date.\n",
    "    # It doesn't represent all steps that are to be available once training job is complete -\n",
    "    # it is a snapshot of a current state of the training job. If you call it after training job is done\n",
    "    # you will get all tensors available at once.\n",
    "    steps = trial.steps()\n",
    "    print('Waiting for tensors to become available...')\n",
    "    time.sleep(3)\n",
    "print('\\nDone')\n",
    "\n",
    "print('Getting tensors...')\n",
    "rendered_steps = []\n",
    "\n",
    "# trial.loaded_all_steps is a way to keep monitoring for a state of a training job as seen by Debugger.\n",
    "# When SageMaker completes training job Debugger, and trial, becomes aware of it.\n",
    "\n",
    "loaded_all_steps = False\n",
    "while not loaded_all_steps:\n",
    "    loaded_all_steps = trial.loaded_all_steps\n",
    "    steps = trial.steps()\n",
    "\n",
    "print('Done!')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Visualize Loss"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from smdebug.tensorflow import modes\n",
    "import time\n",
    "\n",
    "import matplotlib.pyplot as plt\n",
    "%matplotlib inline\n",
    "%config InlineBackend.figure_format='retina'\n",
    "\n",
    "#from matplotlib import pyplot as plt\n",
    "\n",
    "import pandas as pd\n",
    "\n",
    "# Let's visualize weights of the first convolutional layer as they progressively change through training.\n",
    "tensor_name = 'loss'\n",
    "\n",
    "num_batches = trial.tensor(tensor_name).value(step_num=steps[0]).shape[0]\n",
    "for batch_index in range(0, num_batches):\n",
    "    steps_range = trial.tensor(tensor_name).steps()\n",
    "    print(steps_range)\n",
    "    data = get_data(trial=trial, \n",
    "                    tensor_name=tensor_name, \n",
    "                    batch_index=batch_index, \n",
    "                    steps_range=steps_range, \n",
    "                    mode=modes.GLOBAL)\n",
    "    print(data)\n",
    "    data.plot(x='steps', y=tensor_name)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Visualize Accuracy"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from smdebug.tensorflow import modes\n",
    "import time\n",
    "\n",
    "import matplotlib.pyplot as plt\n",
    "%matplotlib inline\n",
    "%config InlineBackend.figure_format='retina'\n",
    "\n",
    "import pandas as pd\n",
    "\n",
    "tensor_name = 'accuracy'\n",
    "\n",
    "num_batches = trial.tensor(tensor_name).value(step_num=steps[0]).shape[0]\n",
    "for batch_index in range(0, num_batches):\n",
    "    steps_range = trial.tensor(tensor_name).steps()\n",
    "    print(steps_range)\n",
    "    data = get_data(trial=trial, \n",
    "                    tensor_name=tensor_name, \n",
    "                    batch_index=batch_index, \n",
    "                    steps_range=steps_range, \n",
    "                    mode=modes.GLOBAL)\n",
    "    print(data)\n",
    "    data.plot(x='steps', y=tensor_name)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%javascript\n",
    "Jupyter.notebook.save_checkpoint();\n",
    "Jupyter.notebook.session.delete();"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "conda_python3",
   "language": "python",
   "name": "conda_python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
